Verb Clustering for Brazilian Portuguese
Abstract
Levin-style classes, which capture the shared syntax and semantics of verbs, have proven useful for many Natural Language Processing (NLP) tasks and applications. However, lexical resources which provide information about such classes are available for only a handful of the world's languages. Because manual development of such resources is extremely time-consuming and cannot reliably capture domain variation in classification, methods for automatic induction of verb classes from texts have gained popularity. However, to date such methods have been applied to English and a handful of other, mainly resource-rich languages. In this paper, we apply the methods to Brazilian Portuguese, a language for which no VerbNet or automatic class induction work exists yet. Since Levin-style classification is said to have a strong cross-linguistic component, we use unsupervised clustering techniques similar to those developed for English without language-specific feature engineering. This yields interesting results which line up well with those obtained for other languages, demonstrating the cross-linguistic nature of this type of classification. However, we also discover and discuss issues which require specific consideration when aiming to optimise the performance of verb clustering for Brazilian Portuguese and other less-resourced languages.
Similar resources
‘Minor’ Languages, ‘Broken’ Translations: On Brazilian Reworkings of an Albanian Novel
This essay approaches the challenges of global translation in the 21st century from what might still be considered a somewhat uncommon example: a direct translation of Ismail Kadaré's 1978 novel Prill e thyër (Broken April) from the original Albanian into Brazilian Portuguese in 2001. Not only does it examine and compare lexical elements in the source and target texts and the usage of translato...
Marcação de tempo por surdos sinalizadores brasileiros (Tense marking by Brazilian deaf signers)
Background: tense marking by deaf signers in Brazilian Sign Language and in written Portuguese. Aim: to analyze verbal tense inflection in written Portuguese, to verify the relationship between performance in using verbal tense inflection and educational status, and to verify tense marking in the production of sentences in Brazilian Sign Language and in written Portuguese. Metho...
The fuzzy boundaries of operator verb and support verb constructions with dar "give" and ter "have" in Brazilian Portuguese
This paper describes the fuzzy boundaries between support verb constructions (SVC) with ter “have” and dar “give” and causative operator verb (VopC) constructions involving these same verbs, in Brazilian Portuguese (BP), which form a complex set of relations: (i) both verbs are the support verb of the same noun (SVC); (ii) dar is the standard (active-like) SVC while ter is a converse (passive-l...
Integrating support verb constructions into a parser
This paper describes the process of integrating into a rule-based parser a set of approximately 1,000 nominal predicates forming support verb constructions (SVC) with the verb dar ‘give’ in Brazilian Portuguese. The system was evaluated on a sample of 580 sentences containing candidate verb-noun combinations for SVC, manually and independently annotated. Best results yield 85% precision, 79% re...
Making Virtue of Necessity: A Verb Lexicon
We describe the verb lexicon of OpenWordNet-PT, a wordnet-like resource for (mostly Brazilian) Portuguese, and a series of experiments that we designed to extend its coverage. These experiments include checking online lists of the most common verbs, checking freely available corpora such as the Bosque-UD (the Bosque corpus annotated with Universal Dependencies) and especially checking a dictionary of...
Publication date: 2014